Biostatistics For Dummies (Monika Wahi John Pezzullo)

Chapter 9

Summarizing and Graphing Your Data

IN THIS CHAPTER

Representing categorical data

Characterizing numerical variables

Putting numerical summaries into tables

Displaying numerical variables with bars and graphs

A large study can involve thousands of participants, hundreds of variables, and millions of individual data points. You need to summarize this ocean of individual values for each variable down to a few numbers, called summary statistics, that give readers an idea of what the whole collection of numbers looks like, that is, how they’re distributed.

When presenting your results, you usually want to arrange these summary statistics into tables that describe how the variables change over time or differ between categories, or how two or more variables are related to each other. And, because a picture really is worth a thousand words, you will want to display these distributions, changes, differences, and relationships graphically. In this chapter, we show you how to summarize and graph both categorical and numerical data. Note: This chapter doesn’t cover time-to-event (survival) data, which is the topic of Chapter 22.

Summarizing and Graphing Categorical Data

You summarize a categorical variable by tallying the number of participants in each category and reporting that tally as a count. You can also express each count as a percentage of the total number of participants in all categories combined. For example, a sample of 422 participants can be summarized by health insurance type, as shown in Table 9-1.

TABLE 9-1 Study Participants Categorized by Health Insurance Type

Health Insurance Type    Count    Percent of Total
Commercial                 128    30.3%
Public                     141    33.4%
Military                    70    16.6%
Other                       83    19.7%
Total                      422    100%
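The count-and-percentage summary in Table 9-1 can be sketched in a few lines of Python. This is only an illustration, not a procedure taken from the chapter: the function name summarize_categories is made up for this example, and the Military and Other counts (70 and 83) are inferred from the table's percentages and its total of 422 rather than quoted from the table.

```python
def summarize_categories(counts):
    """Return (category, count, percent of total) rows for a dict of counts."""
    total = sum(counts.values())
    return [(name, n, round(100 * n / total, 1)) for name, n in counts.items()]


# Commercial and Public counts come from Table 9-1.
# Military (70) and Other (83) are assumptions inferred from the
# reported percentages (16.6% and 19.7%) and the total of 422.
counts = {"Commercial": 128, "Public": 141, "Military": 70, "Other": 83}

rows = summarize_categories(counts)
for name, n, pct in rows:
    print(f"{name:<12}{n:>6}{pct:>8.1f}%")
print(f"{'Total':<12}{sum(counts.values()):>6}")
```

Dividing each count by the total of all categories combined, rather than by some other denominator, is what makes the percentages add up to 100 (apart from rounding).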

The joint distribution of participants between two categorical variables is summarized by a cross-tabulation (or cross-tab). Table 9-2 shows a cross-tab of the same participants, with type of health insurance on one axis and urban-rural classification of their residence on the other.
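To make the idea of a cross-tab concrete, here is a minimal sketch that counts participants in each combination of two categories. The records below are hypothetical toy data, not the study's participants; the two-level Urban/Rural coding and the function name cross_tab are likewise assumptions made for this illustration.

```python
from collections import Counter


def cross_tab(records):
    """Count participants for each (row category, column category) pair."""
    cell_counts = Counter(records)
    row_labels = sorted({r for r, _ in records})
    col_labels = sorted({c for _, c in records})
    # Counter returns 0 for combinations that never occur in the data.
    table = {r: {c: cell_counts[(r, c)] for c in col_labels} for r in row_labels}
    return table, row_labels, col_labels


# Hypothetical (insurance type, residence) records, one per participant.
records = [
    ("Commercial", "Urban"), ("Commercial", "Rural"), ("Public", "Urban"),
    ("Public", "Urban"), ("Military", "Rural"), ("Other", "Urban"),
]

table, row_labels, col_labels = cross_tab(records)
print("Insurance type", col_labels)
for r in row_labels:
    row_total = sum(table[r].values())
    print(r, [table[r][c] for c in col_labels], "row total:", row_total)
```

Each participant contributes to exactly one cell, so the cell counts add up to the number of participants.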